125 research outputs found

    Enrichment analysis of Alu elements with different spatial chromatin proximity in the human genome

    Transposable elements (TEs) are no longer regarded as simply “junk DNA”, following the continual discovery of their multifunctional roles in eukaryotic genomes. As one of the most important and abundant TEs still active in the human genome, Alu, a SINE family, has demonstrated indispensable regulatory functions at the sequence level, but its spatial roles remain unclear. Technologies based on 3C (chromosome conformation capture) have revealed the three-dimensional structure of chromatin and made it possible to study distal chromatin interactions in the genome. To find the role TEs play in distal regulation in the human genome, we compiled newly released Hi-C data, TE annotations, histone marker annotations, and genome-wide methylation data for correlation analysis. We found that the density of Alu elements showed a strong positive correlation with the level of chromatin interactions (hESC: r = 0.9, P < 2.2 × 10⁻¹⁶; IMR90 fibroblasts: r = 0.94, P < 2.2 × 10⁻¹⁶) and also a significant positive correlation with some remote functional DNA elements such as enhancers and promoters (Enhancer: hESC: r = 0.997, P = 2.3 × 10⁻⁴; IMR90: r = 0.934, P = 2 × 10⁻²; Promoter: hESC: r = 0.995, P = 3.8 × 10⁻⁴; IMR90: r = 0.996, P = 3.2 × 10⁻⁴). Further investigation involving GC content and methylation status showed that the GC content of Alu-covered sequences shared a similar pattern with that of the overall sequence, suggesting that Alu elements also function as providers of GC nucleotides and CpG sites. In all, our results suggest that Alu elements may act as an alternative parameter for evaluating Hi-C data, which is confirmed by the correlation analysis of Alu elements and histone markers. Moreover, the GC-rich Alu sequence can bring high GC content and methylation flexibility to regions with more distal chromatin contacts, regulating the transcription of tissue-specific genes
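
As a rough illustration of the correlation analysis this abstract describes, the sketch below computes a correlation coefficient between per-bin Alu density and a chromatin interaction level. It assumes the reported r is a Pearson coefficient, which the abstract does not state. The function name `pearson_r` and the sample values are hypothetical and are not taken from the study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (numerator) and sums of squared
    # deviations (denominator terms).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-bin values, for illustration only (not data from the study).
alu_density = [0.05, 0.10, 0.15, 0.20, 0.25]
contact_level = [1.0, 2.1, 2.9, 4.2, 5.0]
r = pearson_r(alu_density, contact_level)
```

A value of r close to 1 for such bins would correspond to the kind of strong positive relationship the abstract reports. Assessing significance (the P values) would require an additional test that is not sketched here.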

    Comparison of sequencing-based methods to profile DNA methylation and identification of monoallelic epigenetic modifications.

    Analysis of DNA methylation patterns relies increasingly on sequencing-based profiling methods. The four most frequently used sequencing-based technologies are the bisulfite-based methods MethylC-seq and reduced representation bisulfite sequencing (RRBS), and the enrichment-based techniques methylated DNA immunoprecipitation sequencing (MeDIP-seq) and methylated DNA binding domain sequencing (MBD-seq). We applied all four methods to biological replicates of human embryonic stem cells to assess their genome-wide CpG coverage, resolution, cost, concordance and the influence of CpG density and genomic context. The methylation levels assessed by the two bisulfite methods were concordant (their difference did not exceed a given threshold) for 82% of CpGs and 99% of the non-CpG cytosines. Using binary methylation calls, the two enrichment methods were 99% concordant and regions assessed by all four methods were 97% concordant. We combined MeDIP-seq with methylation-sensitive restriction enzyme sequencing (MRE-seq) for comprehensive methylome coverage at lower cost. This, along with RNA-seq and ChIP-seq of the ES cells, enabled us to detect regions with allele-specific epigenetic states, identifying most known imprinted regions and new loci with monoallelic epigenetic marks and monoallelic expression
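
The concordance criterion mentioned in this abstract (two methods agree at a site when their methylation levels differ by no more than a given threshold) can be sketched as follows. This is only a minimal illustration: the function name `concordance_fraction`, the site keys, the methylation levels, and the threshold of 0.25 are all assumptions made for the example, since the abstract does not give the actual threshold or data layout.

```python
def concordance_fraction(levels_a, levels_b, max_diff):
    """Fraction of sites covered by both methods whose methylation levels
    differ by at most max_diff."""
    shared = levels_a.keys() & levels_b.keys()
    if not shared:
        raise ValueError("the two methods share no sites")
    agree = sum(
        1 for site in shared
        if abs(levels_a[site] - levels_b[site]) <= max_diff
    )
    return agree / len(shared)


# Hypothetical methylation levels (0 to 1) keyed by site, illustration only.
method_a = {"chr1:100": 0.90, "chr1:250": 0.10, "chr1:400": 0.55, "chr2:50": 0.80}
method_b = {"chr1:100": 0.85, "chr1:250": 0.40, "chr1:400": 0.50, "chr3:10": 0.20}

# 0.25 is an assumed threshold, not the one used in the study.
fraction = concordance_fraction(method_a, method_b, max_diff=0.25)
```

Only sites covered by both methods enter the denominator, so the result reflects agreement on shared sites and not differences in coverage.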

    Integrating transposable elements in the 3D genome

    Chromosome organisation is increasingly recognised as an essential component of genome regulation, cell fate and cell health. Within the realm of transposable elements (TEs), however, the spatial information of how genomes are folded is still only rarely integrated in experimental studies or accounted for in modelling. Whilst polymer physics is recognised as an important tool to understand the mechanisms of genome folding, in this commentary we discuss its potential applicability to aspects of TE biology. Based on recent work on the relationship between genome organisation and TE integration, we argue that existing polymer models may be extended to create a predictive framework for the study of TE integration patterns. We suggest that these models may offer orthogonal and generic insights into the integration profiles (or "topography") of TEs across organisms. In addition, we provide simple polymer physics arguments and preliminary molecular dynamics simulations of TEs inserting into heterogeneously flexible polymers. By considering this simple model, we show how polymer folding and local flexibility may generically affect TE integration patterns. The preliminary discussion reported in this commentary aims to lay the foundations for a large-scale analysis of TE integration dynamics and topography as a function of the three-dimensional host genome

    Complex Loci in Human and Mouse Genomes

    Mammalian genomes harbor a larger than expected number of complex loci, in which multiple genes are coupled by shared transcribed regions in antisense orientation and/or by bidirectional core promoters. To determine the incidence, functional significance, and evolutionary context of mammalian complex loci, we identified and characterized 5,248 cis–antisense pairs, 1,638 bidirectional promoters, and 1,153 chains of multiple cis–antisense and/or bidirectionally promoted pairs from 36,606 mouse transcriptional units (TUs), along with 6,141 cis–antisense pairs, 2,113 bidirectional promoters, and 1,480 chains from 42,887 human TUs. In both human and mouse, 25% of TUs resided in cis–antisense pairs, only 17% of which were conserved between the two organisms, indicating frequent species specificity of antisense gene arrangements. A sampling approach indicated that over 40% of all TUs might actually be in cis–antisense pairs, and that only a minority of these arrangements are likely to be conserved between human and mouse. Bidirectional promoters were characterized by variable transcriptional start sites and an identifiable midpoint at which overall sequence composition changed strand and the direction of transcriptional initiation switched. In microarray data covering a wide range of mouse tissues, genes in cis–antisense and bidirectionally promoted arrangement showed a higher probability of being coordinately expressed than random pairs of genes. In a case study on homeotic loci, we observed extensive transcription of nonconserved sequences on the noncoding strand, implying that the presence rather than the sequence of these transcripts is of functional importance. Complex loci are ubiquitous, host numerous nonconserved gene structures and lineage-specific exonification events, and may have a cis-regulatory impact on the member genes

    Systematic Search for Recipes to Generate Induced Pluripotent Stem Cells

    Generation of induced pluripotent stem cells (iPSCs) opens a new avenue in regenerative medicine. One of the major hurdles for therapeutic applications is improving the efficiency of generating iPSCs while avoiding tumorigenicity, which requires searching for new reprogramming recipes. We present a systems biology approach to efficiently evaluate a large number of possible recipes and find those that are most effective at generating iPSCs. We not only recovered several experimentally confirmed recipes but also suggested new ones that may improve reprogramming efficiency and quality. In addition, our approach allows one to estimate the cell-state landscape, monitor the progress of reprogramming, identify important regulatory transition states, and ultimately understand the mechanisms of iPSC generation

    Structural determinants of the SINE B2 element embedded in the long non-coding RNA activator of translation AS Uchl1

    Pervasive transcription of mammalian genomes leads to a previously underestimated level of complexity in gene regulatory networks. Recently, we have identified a new functional class of natural and synthetic antisense long non-coding RNAs (lncRNA) that increases translation of partially overlapping sense mRNAs. These molecules were named SINEUPs, as they require an embedded inverted SINE B2 element for their UP-regulation of translation. Mouse AS Uchl1 is the representative member of natural SINEUPs. It was originally discovered for its role in increasing translation of Uchl1 mRNA, a gene associated with neurodegenerative diseases. Here we present the secondary structure of the SINE B2 Transposable Element (TE) embedded in AS Uchl1. We find that specific structural regions, containing a short hairpin, are required for the ability of AS Uchl1 RNA to increase translation of its target mRNA. We also provide a high-resolution structure of the relevant hairpin, based on NMR observables. Our results highlight the importance of structural determinants in embedded TEs for their activity as functional domains in lncRNAs

    Fuzzy Tandem Repeats Containing p53 Response Elements May Define Species-Specific p53 Target Genes

    Evolutionary forces that shape regulatory networks remain poorly understood. In mammals, the Rb pathway is a classic example of species-specific gene regulation, as a germline mutation in one Rb allele promotes retinoblastoma in humans, but not in mice. Here we show that p53 transactivates the Retinoblastoma-like 2 (Rbl2) gene to produce p130 in murine, but not human, cells. We found intronic fuzzy tandem repeats containing perfect p53 response elements to be important for this regulation. We next identified two other murine genes regulated by p53 via fuzzy tandem repeats: Ncoa1 and Klhl26. The repeats are poorly conserved in evolution, and the p53-dependent regulation of the murine genes is lost in humans. Our results indicate a role for the rapid evolution of tandem repeats in shaping differences in p53 regulatory networks between mammalian species

    Gene Properties and Chromatin State Influence the Accumulation of Transposable Elements in Genes

    Transposable elements (TEs) are mobile DNA sequences found in the genomes of almost all species. By measuring the normalized coverage of TE sequences within genes, we identified sets of genes with conserved extremes of high/low TE density in the genomes of human, mouse and cow and denoted them as ‘shared upper/lower outliers (SUOs/SLOs)’. By comparing these outlier genes to the genomic background, we show that a large proportion of SUOs are involved in metabolic pathways and tend to be mammal-specific, whereas many SLOs are related to developmental processes and have more ancient origins. Furthermore, the proportions of different types of TEs within human and mouse orthologous SUOs showed high similarity, even though most detectable TEs in these two genomes inserted after their divergence. Interestingly, our computational analysis of polymerase-II (Pol-II) occupancy at gene promoters in different mouse tissues showed that 60% of tissue-specific SUOs show strong Pol-II binding only in embryonic stem cells (ESCs), a proportion significantly higher than the genomic background (37%). In addition, our analysis of histone marks such as H3K4me3 and H3K27me3 in mouse ESCs also suggests a strong association between TE-rich genes and open chromatin at promoters. Finally, two independent whole-transcriptome datasets show a positive association between TE density and gene expression level in ESCs. While this study focuses on genes with extreme TE densities, the above results clearly show that the probability of TE accumulation/fixation in mammalian genes is not random and is likely associated with different factors/gene properties; most importantly, they indicate an association between the TE insertion/fixation rate and gene activity status in ES cells
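
One plausible reading of "normalized coverage of TE sequences within genes" is the fraction of a gene's length that is covered by annotated TEs. The sketch below computes that fraction, merging overlapping annotations so that no base is counted twice. The abstract does not specify the normalization, so dividing by gene length is an assumption here, and the function name `te_coverage` and the coordinates are hypothetical.

```python
def te_coverage(gene_start, gene_end, te_intervals):
    """Fraction of the half-open gene interval [gene_start, gene_end)
    covered by TE intervals, each given as a (start, end) pair."""
    if gene_end <= gene_start:
        raise ValueError("gene interval must have positive length")
    # Clip each TE to the gene and drop TEs that do not overlap it.
    clipped = sorted(
        (max(s, gene_start), min(e, gene_end))
        for s, e in te_intervals
        if s < gene_end and e > gene_start
    )
    covered = 0
    cur_start, cur_end = None, None
    for s, e in clipped:
        if cur_end is None or s > cur_end:
            # Start a new merged block and bank the length of the previous one.
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = s, e
        else:
            # Overlapping or adjacent TE: extend the current merged block.
            cur_end = max(cur_end, e)
    if cur_end is not None:
        covered += cur_end - cur_start
    return covered / (gene_end - gene_start)


# Hypothetical coordinates, for illustration only.
density = te_coverage(100, 200, [(90, 120), (110, 130), (150, 160), (300, 400)])
```

In this example the first two TEs merge into one 30-base block inside the gene, the third adds 10 bases, and the fourth lies outside the gene, giving a coverage of 0.4. Genes could then be ranked by this value within each species to look for shared extremes.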

    Towards an Evolutionary Model of Transcription Networks

    DNA evolution models have made invaluable contributions to comparative genomics, although incorporating non-genomic features into these models has seemed a formidable task. In order to build an evolutionary model of transcription networks (TNs), we had to forfeit the substitution model used in DNA evolution and start from modeling the evolution of the regulatory relationships. We present a quantitative evolutionary model of TNs, subjecting the phylogenetic distance and the evolutionary changes of cis-regulatory sequence, gene expression and network structure to one probabilistic framework. Using the genome sequences and gene expression data from multiple species, this model can predict regulatory relationships between a transcription factor (TF) and its target genes in all species, and thus identify TN re-wiring events. Applying this model to analyze the pre-implantation development of three mammalian species, we identified the conserved and re-wired components of the TNs downstream of a set of TFs including Oct4, Gata3/4/6, cMyc and nMyc. Evolutionary events on the DNA sequence that led to turnover of TF binding sites were identified, including the birth of an Oct4 binding site by a 2-nt deletion. In contrast to recent reports of large interspecies differences in TF binding sites and gene expression patterns, the interspecies difference in TF-target relationships is much smaller. The data showed increasing conservation levels from genomic sequences to TF-DNA interaction, gene expression, TN, and finally to morphology, suggesting that evolutionary changes are larger at molecular levels and smaller at functional levels. The data also showed that evolutionarily older TFs are more likely to have conserved target genes, whereas younger TFs tend to have larger re-wiring rates